Privacy-preserving parametric inference: a case for robust statistics
Differential privacy is a cryptographically-motivated approach to privacy
that has become a very active field of research over the last decade in
theoretical computer science and machine learning. In this paradigm one assumes
there is a trusted curator who holds the data of individuals in a database and
the goal of privacy is to simultaneously protect individual data while allowing
the release of global characteristics of the database. In this setting we
introduce a general framework for parametric inference with differential
privacy guarantees. We first obtain differentially private estimators based on
bounded influence M-estimators by leveraging their gross-error sensitivity in
the calibration of a noise term added to them in order to ensure privacy. We
then show how a similar construction can also be applied to construct
differentially private test statistics analogous to the Wald, score and
likelihood ratio tests. We provide statistical guarantees for all our proposals
via an asymptotic analysis. An interesting consequence of our results is to
further clarify the connection between differential privacy and robust
statistics. In particular, we demonstrate that differential privacy is a weaker
stability requirement than infinitesimal robustness, and show that robust
M-estimators can be easily randomized in order to guarantee both differential
privacy and robustness towards the presence of contaminated data. We illustrate
our results on both simulated and real data.
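The noise-calibration idea can be made concrete in the simplest case, a Huber M-estimator of location: its psi-function is bounded by the tuning constant c, so its gross-error sensitivity is finite, and a Laplace noise term with scale on the order of c / (n·epsilon) yields a differentially private release. The sketch below is schematic, with illustrative constants and function names, not the paper's exact construction.

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Bounded psi-function of the Huber M-estimator (bounded influence)."""
    return np.clip(r, -c, c)

def huber_location(x, c=1.345, n_iter=100):
    """Huber M-estimate of location via a simple fixed-point iteration."""
    mu = np.median(x)
    for _ in range(n_iter):
        mu = mu + huber_psi(x - mu, c).mean()  # contraction step
    return mu

def private_huber_location(x, epsilon, c=1.345, rng=None):
    """Differentially private release of the Huber location estimate.

    Schematic calibration: the gross-error sensitivity is bounded by a
    multiple of c, so Laplace noise of scale O(c / (n * epsilon)) is
    added; the exact constant here is illustrative.
    """
    if rng is None:
        rng = np.random.default_rng()
    n = len(x)
    return huber_location(x, c) + rng.laplace(scale=2 * c / (n * epsilon))
```

Because the psi-function is bounded, a single contaminated observation can shift the estimate by at most a fixed amount, which is precisely what makes the noise scale independent of the data range.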
Centrality measures for graphons: Accounting for uncertainty in networks
As relational datasets modeled as graphs keep increasing in size and their
data-acquisition is permeated by uncertainty, graph-based analysis techniques
can become computationally and conceptually challenging. In particular, node
centrality measures rely on the assumption that the graph is perfectly known --
a premise not necessarily fulfilled for large, uncertain networks. Accordingly,
centrality measures may fail to faithfully extract the importance of nodes in
the presence of uncertainty. To mitigate these problems, we suggest a
statistical approach based on graphon theory: we introduce formal definitions
of centrality measures for graphons and establish their connections to
classical graph centrality measures. A key advantage of this approach is that
centrality measures defined at the modeling level of graphons are inherently
robust to stochastic variations of specific graph realizations. Using the
theory of linear integral operators, we define degree, eigenvector, Katz and
PageRank centrality functions for graphons and establish concentration
inequalities demonstrating that graphon centrality functions arise naturally as
limits of their counterparts defined on sequences of graphs of increasing size.
The same concentration inequalities also provide high-probability bounds
between the graphon centrality functions and the centrality measures on any
sampled graph, thereby establishing a measure of uncertainty of the measured
centrality score.
Comment: Authors ordered alphabetically, all authors contributed equally. 21 pages, 7 figures.
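To make the limiting objects concrete, here is a minimal sketch (using an invented graphon, not one from the paper) of the degree centrality function d(u) = ∫₀¹ W(u, v) dv and of how normalized degrees of a sampled graph concentrate around it:

```python
import numpy as np

def W(u, v):
    """An illustrative smooth graphon (edge-probability function)."""
    return (u + v) / 2.0

def graphon_degree(u, n_grid=2001):
    """Degree centrality function d(u) = integral of W(u, v) over v in [0,1]."""
    v = np.linspace(0.0, 1.0, n_grid)
    return W(u, v).mean()  # quadrature on a uniform grid

def sample_graph(n, rng):
    """Sample an n-node graph: u_i ~ U(0,1), edge (i,j) w.p. W(u_i, u_j)."""
    u = rng.uniform(size=n)
    P = W(u[:, None], u[None, :])
    upper = np.triu(rng.uniform(size=(n, n)) < P, k=1)
    A = upper | upper.T  # symmetric adjacency, no self-loops
    return u, A.astype(float)
```

For this linear graphon, d(u) = u/2 + 1/4 in closed form, and the normalized degrees of a moderately large sample stay within a few hundredths of it, in the spirit of the concentration inequalities described above.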
Kernel PCA for multivariate extremes
We propose kernel PCA as a method for analyzing the dependence structure of
multivariate extremes and demonstrate that it can be a powerful tool for
clustering and dimension reduction. Our work provides some theoretical insight
into the preimages obtained by kernel PCA, demonstrating that under certain
conditions they can effectively identify clusters in the data. We build on
these new insights to characterize rigorously the performance of kernel PCA
based on an extremal sample, i.e., the angular part of random vectors for which
the radius exceeds a large threshold. More specifically, we focus on the
asymptotic dependence of multivariate extremes characterized by the angular or
spectral measure in extreme value theory and provide a careful analysis in the
case where the extremes are generated from a linear factor model. We give
theoretical guarantees on the performance of kernel PCA preimages of such
extremes by leveraging their asymptotic distribution together with Davis-Kahan
perturbation bounds. Our theoretical findings are complemented with numerical
experiments illustrating the finite-sample performance of our methods.
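The pipeline can be sketched end to end in plain NumPy (with an illustrative linear factor model and RBF kernel of our own choosing, not the paper's exact setup): simulate heavy-tailed factors, keep the angular parts of observations whose radius exceeds a high empirical quantile, and embed that extremal sample with kernel PCA.

```python
import numpy as np

def rbf_kernel_pca(X, n_components=2, gamma=5.0):
    """Kernel PCA with an RBF kernel, written out directly (a sketch)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-gamma * sq)
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                      # double-centred Gram matrix
    w, V = np.linalg.eigh(Kc)
    idx = np.argsort(w)[::-1][:n_components]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Illustrative linear factor model with regularly varying (Pareto) factors.
rng = np.random.default_rng(1)
Z = rng.pareto(2.0, size=(5000, 2)) + 1.0
A = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])  # factor loadings
X = Z @ A.T

# Extremal sample: angular parts of observations with the largest radii.
R = np.linalg.norm(X, axis=1)
mask = R > np.quantile(R, 0.95)
Theta = X[mask] / R[mask, None]

scores = rbf_kernel_pca(Theta, n_components=2)
```

Under a factor model like this, the largest observations are dominated by a single heavy-tailed factor, so their angular parts cluster near a few directions, which is the structure kernel PCA is being asked to recover.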
Differentially private inference via noisy optimization
We propose a general optimization-based framework for computing
differentially private M-estimators and a new method for constructing
differentially private confidence regions. Firstly, we show that robust
statistics can be used in conjunction with noisy gradient descent or noisy
Newton methods in order to obtain optimal private estimators with global linear
or quadratic convergence, respectively. We establish local and global
convergence guarantees, under both local strong convexity and self-concordance,
showing that our private estimators converge with high probability to a nearly
optimal neighborhood of the non-private M-estimators. Secondly, we tackle the
problem of parametric inference by constructing differentially private
estimators of the asymptotic variance of our private M-estimators. This
naturally leads to approximate pivotal statistics for constructing confidence
regions and conducting hypothesis testing. We demonstrate the effectiveness of
a bias correction that leads to enhanced small-sample empirical performance in
simulations. We illustrate the benefits of our methods in several numerical
examples.
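A schematic version of the noisy gradient descent idea (with an invented noise calibration, not the paper's): clipping per-sample gradients bounds the sensitivity of each update, so Gaussian noise of a matching scale can be added at every step while the iterates still converge to a neighborhood of the non-private estimator.

```python
import numpy as np

def clipped_grad(theta, X, y, c=1.0):
    """Mean of per-sample squared-loss gradients, each clipped to norm <= c
    (the clipping plays the role of a bounded influence function)."""
    resid = X @ theta - y
    g = resid[:, None] * X                        # per-sample gradients
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    g = g / np.maximum(norms / c, 1.0)            # clip to norm <= c
    return g.mean(axis=0)

def noisy_gd(X, y, epsilon, T=100, eta=0.5, c=1.0, rng=None):
    """Noisy gradient descent with Gaussian noise per step, calibrated to the
    clipped-gradient sensitivity (the noise scale below is illustrative)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, d = X.shape
    sigma = 2 * c * np.sqrt(T) / (n * epsilon)    # schematic calibration
    theta = np.zeros(d)
    for _ in range(T):
        theta = theta - eta * (clipped_grad(theta, X, y, c)
                               + sigma * rng.normal(size=d))
    return theta
```

The design choice mirrors the abstract's point: robustness (bounded per-sample gradients) is what makes the per-step sensitivity, and hence the noise, small enough for the private iterates to track the non-private ones.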
Reducing the environmental impact of surgery on a global scale: systematic review and co-prioritization with healthcare workers in 132 countries
Abstract
Background
Healthcare cannot achieve net-zero carbon without addressing operating theatres. The aim of this study was to prioritize feasible interventions to reduce the environmental impact of operating theatres.
Methods
This study adopted a four-phase Delphi consensus co-prioritization methodology. In phase 1, a systematic review of published interventions and global consultation of perioperative healthcare professionals were used to longlist interventions. In phase 2, iterative thematic analysis consolidated comparable interventions into a shortlist. In phase 3, the shortlist was co-prioritized based on patient and clinician views on acceptability, feasibility, and safety. In phase 4, ranked lists of interventions were presented by their relevance to high-income countries and low–middle-income countries.
Results
In phase 1, 43 interventions were identified, which had low uptake in practice according to 3042 professionals globally. In phase 2, a shortlist of 15 intervention domains was generated. In phase 3, interventions were deemed acceptable for more than 90 per cent of patients except for reducing general anaesthesia (84 per cent) and re-sterilization of 'single-use' consumables (86 per cent). In phase 4, the top three shortlisted interventions for high-income countries were: introducing recycling; reducing use of anaesthetic gases; and appropriate clinical waste processing. In phase 4, the top three shortlisted interventions for low–middle-income countries were: introducing reusable surgical devices; reducing use of consumables; and reducing the use of general anaesthesia.
Conclusion
This is a step toward environmentally sustainable operating environments with actionable interventions applicable to both high- and low–middle-income countries.
Robust penalized M-estimators
Data sets where the number of variables p is comparable to or larger than the number of observations n arise frequently nowadays in a large variety of fields. High-dimensional statistics has played a key role in the analysis of such data, and much progress has been achieved over the last two decades in this domain. Most of the existing procedures are likelihood-based and therefore quite sensitive to deviations from the stochastic assumptions. We study robust penalized M-estimators and discuss some of their formal robustness properties. In the context of high-dimensional generalized linear models we provide oracle properties for our proposals. We discuss some strategies for the selection of the tuning parameter and extensions to generalized additive models. We illustrate the behavior of our estimators in a simulation study.
Author affiliation: University of Geneva
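A generic instance of such an estimator is a Huber-loss lasso, computable by proximal gradient descent. The sketch below is our own illustration, not the paper's algorithm: a bounded psi-function supplies robustness to outliers, and soft-thresholding enforces the sparsity penalty.

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Derivative of the Huber loss w.r.t. the residual (bounded, hence robust)."""
    return np.clip(r, -c, c)

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def robust_lasso(X, y, lam, c=1.345, n_iter=500, eta=None):
    """Proximal gradient (ISTA) for the Huber-loss lasso:
        min_beta (1/n) sum_i rho_c(y_i - x_i' beta) + lam * ||beta||_1
    (a generic sketch of a robust penalized M-estimator)."""
    n, p = X.shape
    if eta is None:
        eta = n / np.linalg.norm(X, 2) ** 2   # step size from Lipschitz bound
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ huber_psi(y - X @ beta, c) / n
        beta = soft_threshold(beta - eta * grad, eta * lam)
    return beta
```

Because the psi-function caps each residual's contribution to the gradient, a handful of gross outliers in y cannot drag the fit, while the soft-thresholding step keeps inactive coefficients at zero.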